Barcelona’s Digital Landscape: a data-driven exploration of urban dynamics around Sagrada Familia. AI-generated by our team.
A tourism company based in Zürich, Switzerland, has observed a significant increase in travel demand to Barcelona in recent years. Indeed, Barcelona ranks as the third most in-demand city for Airbnb rentals in Europe, behind Paris and London.
Consequently, the company’s manager has requested a Machine Learning study and analysis of Airbnb accommodations in the city. The goal is to understand price behavior and identify the factors influencing accommodation costs and occupancy, enabling the company to provide optimal responses to clients’ inquiries.
To achieve this goal, the team has decided to analyse and address three questions to provide comprehensive insights for the manager.
Barcelona is one of the most visited cities in Europe, and the rise of Airbnb and other short-term rental platforms has led to a notable increase in tourism. However, this growth also presents challenges for accommodation businesses and the local housing market. A study available on the Social Science Research Network (SSRN, link: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3428237) found that rental costs in neighborhoods with high Airbnb activity increased by 7% between 2009 and 2016. This is primarily because property owners, motivated by tourist demand for short-term rentals, frequently opt to let their properties short-term at higher rates rather than committing to long-term leases.
For these reasons, it is crucial for tourism companies to understand this dynamic market to remain competitive and provide tailored services to their clients.
The Zürich-based tourism company needs reliable data on Airbnb prices and occupancy rates to make data-driven recommendations and stay ahead of competitors.
This analysis is for educational purposes only. The findings are based on public data and are not professional advice. The results should not be used for business or policy decisions.
To conduct the study, the team has decided to analyse a dataset of Barcelona Airbnbs available on the Kaggle website (link: https://www.kaggle.com/datasets/fermatsavant/airbnb-dataset-of-barcelona-city)
The dataset consists of 19,833 observations across 25 variables, including geographical zones, amenities, prices, and accommodations.
Below, we can see the structure of the dataset, and the names and data types for each column.
## Rows: 19,833
## Columns: 25
## $ X <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14…
## $ id <int> 18666, 18674, 21605, 23197, 25786, 31377, 31380,…
## $ host_id <int> 71615, 71615, 82522, 90417, 108310, 134698, 1346…
## $ host_is_superhost <chr> "f", "f", "f", "t", "t", "f", "f", "f", "f", "f"…
## $ host_listings_count <int> 45, 45, 2, 5, 1, 9, 9, 41, 41, 1, NA, 3, 3, 4, 4…
## $ neighbourhood <chr> "Sant Martí", "La Sagrada Família", "Sant Martí"…
## $ zipcode <chr> "8026", "8025", "8018", "8930", "8012", "8025", …
## $ latitude <dbl> 41.40889, 41.40420, 41.40560, 41.41203, 41.40145…
## $ longitude <dbl> 2.18555, 2.17306, 2.19821, 2.22114, 2.15645, 2.1…
## $ property_type <chr> "Apartment", "Apartment", "Apartment", "Apartmen…
## $ room_type <chr> "Entire home/apt", "Entire home/apt", "Private r…
## $ accommodates <int> 6, 8, 2, 6, 2, 2, 3, 4, 5, 1, 6, 2, 8, 2, 1, 1, …
## $ bathrooms <dbl> 1.0, 2.0, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.5, 1.0…
## $ bedrooms <int> 2, 3, 1, 3, 1, 1, 1, 1, 3, 1, 2, 1, 4, 1, 1, 1, …
## $ beds <int> 4, 6, 1, 8, 1, 2, 2, 1, 3, 1, 7, 1, 6, 1, 1, 1, …
## $ amenities <chr> "['TV', 'Internet', 'Wifi', 'Air conditioning', …
## $ price <chr> "$130.00", "$60.00", "$33.00", "$210.00", "$45.0…
## $ minimum_nights <int> 3, 1, 2, 3, 1, 3, 3, 1, 1, 29, 2, 4, 5, 2, 2, 2,…
## $ has_availability <chr> "t", "t", "t", "t", "t", "t", "t", "t", "t", "t"…
## $ availability_30 <int> 0, 3, 4, 11, 8, 5, 3, 2, 3, 4, 2, 25, 9, 1, 3, 2…
## $ availability_60 <int> 0, 20, 8, 33, 19, 8, 8, 17, 19, 4, 16, 55, 31, 3…
## $ availability_90 <int> 0, 50, 15, 63, 41, 16, 15, 29, 31, 4, 42, 80, 61…
## $ availability_365 <int> 182, 129, 15, 318, 115, 211, 211, 266, 257, 26, …
## $ number_of_reviews_ltm <int> 0, 10, 36, 16, 49, 0, 2, 34, 15, 0, 10, 0, 24, 6…
## $ review_scores_rating <int> 80, 87, 90, 95, 95, 95, 87, 92, 88, 99, 87, 68, …
- X: numerical index for rows
- id: unique identifier for listings
- host_id: unique identifier for hosts
- host_listings_count: number of listings by the host
- latitude: geographic latitude of the listing
- longitude: geographic longitude of the listing
- accommodates: number of guests the listing can accommodate
- bathrooms: number of bathrooms in the listing
- bedrooms: number of bedrooms in the listing
- beds: number of beds in the listing
- minimum_nights: minimum number of nights required for booking
- availability_30: number of available nights in the next 30 days
- availability_60: number of available nights in the next 60 days
- availability_90: number of available nights in the next 90 days
- availability_365: number of available nights in the next 365 days
- number_of_reviews_ltm: number of reviews in the last 12 months
- review_scores_rating: average review rating score
- host_is_superhost: indicates if the host is a superhost ("t" or "f")
- has_availability: indicates if the listing is available for booking ("t" or "f")
- neighbourhood: name of the neighbourhood where the listing is located
- zipcode: postal code of the listing
- property_type: type of property (e.g., “Apartment”)
- room_type: type of room (e.g., “Entire home/apt”)
- amenities: list of amenities provided in the listing
- price: price of the listing as a string (e.g., “$130.00”)

As the structure shows, the variable ‘price’ has a ‘Character’ data type. Therefore, in chapter 3, this field will be converted into a numeric variable to enable the necessary calculations.
To streamline the calculations and analysis, a sub-dataset of 10,000 randomly selected observations is created in the following steps. Additionally, a seed is set to ensure that the same observations are selected every time the analysis is run.
## Rows: 10,000
## Columns: 25
## $ X <int> 16886, 3429, 3695, 3051, 11158, 8191, 18373, 172…
## $ id <int> 34191752, 6787210, 7555948, 5767967, 24448300, 1…
## $ host_id <int> 224372816, 15681396, 3911721, 2151490, 163379623…
## $ host_is_superhost <chr> "t", "f", "f", "f", "f", "f", "f", "f", "f", "t"…
## $ host_listings_count <int> 9, 6, 39, 6, 109, 1, 0, 16, 32, 2, 32, 2, 91, 1,…
## $ neighbourhood <chr> "Ciutat Vella", "La Verneda i La Pau", "Camp d'e…
## $ zipcode <chr> "8001", "8020", "8025", "8001", "8014", "8041", …
## $ latitude <dbl> 41.37876, 41.42130, 41.40366, 41.38471, 41.37340…
## $ longitude <dbl> 2.16882, 2.20321, 2.17096, 2.16538, 2.14020, 2.1…
## $ property_type <chr> "Apartment", "Apartment", "Apartment", "Apartmen…
## $ room_type <chr> "Entire home/apt", "Private room", "Entire home/…
## $ accommodates <int> 5, 5, 6, 16, 4, 4, 1, 2, 2, 4, 4, 1, 1, 2, 2, 2,…
## $ bathrooms <dbl> 1.0, 1.0, 1.0, 6.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.5…
## $ bedrooms <int> 2, 2, 2, 7, 2, 1, 1, 0, 1, 1, 2, 3, 1, 1, 1, 1, …
## $ beds <int> 5, 5, 5, 13, 2, 2, 1, 1, 2, 2, 3, 1, 1, 1, 1, 2,…
## $ amenities <chr> "['TV', 'Cable TV', 'Wifi', 'Air conditioning', …
## $ price <chr> "$105.00", "$25.00", "$85.00", "$899.00", "$83.0…
## $ minimum_nights <int> 32, 1, 1, 2, 1, 2, 1, 1, 3, 1, 3, 30, 31, 3, 3, …
## $ has_availability <chr> "t", "t", "t", "t", "t", "t", "t", "t", "t", "t"…
## $ availability_30 <int> 9, 18, 21, 3, 2, 0, 10, 17, 19, 17, 15, 9, 9, 27…
## $ availability_60 <int> 39, 48, 51, 24, 16, 0, 30, 47, 23, 47, 45, 39, 3…
## $ availability_90 <int> 40, 78, 71, 47, 26, 0, 60, 77, 42, 59, 75, 69, 6…
## $ availability_365 <int> 40, 353, 327, 241, 297, 0, 335, 352, 127, 64, 16…
## $ number_of_reviews_ltm <int> 0, 12, 2, 7, 0, 0, 2, 4, 9, 21, 2, 0, 0, 0, 13, …
## $ review_scores_rating <int> NA, 83, 90, 92, NA, 96, 100, 90, 86, 97, 90, 95,…
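The sampling step described above can be sketched as follows. This is a minimal illustration, not the team's actual code: `BCN_Accomm_full` is a hypothetical stand-in for the full Kaggle table, and the seed value 123 is an assumption, since the report does not state which seed was used.

```r
# Hypothetical stand-in for the full listings table (19,833 rows in the report).
BCN_Accomm_full <- data.frame(
  id    = seq_len(19833),
  price = sprintf("$%d.00", (seq_len(19833) %% 200L) + 10L)
)

set.seed(123)  # assumed seed value; the report does not say which one it uses

# Draw 10,000 distinct row indices (sampling without replacement).
sample_rows    <- sample(nrow(BCN_Accomm_full), size = 10000)
BCN_Accomm_sub <- BCN_Accomm_full[sample_rows, ]

dim(BCN_Accomm_sub)
```

Setting the seed immediately before `sample()` means that rerunning the script selects the same 10,000 rows, which keeps every later table and plot comparable between runs.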
To address the research question, the study will be divided into three parts. First, an Exploratory Data Analysis (EDA) will be conducted to gain a deeper understanding of the data. Second, Machine Learning models will be implemented, and their performance will be evaluated to identify the best-performing model. Finally, the selected model will be used to provide the most accurate answer to the research question posed by the team.
The different models to be developed are:
First, the pricing variable will be converted into a numeric format, and in the fifth chapter of this report (Machine Learning Models), the categorical variables will be transformed into factors for further analysis and modeling.
BCN_Accomm_sub$price <- gsub(",", "", BCN_Accomm_sub$price) # removed ','
BCN_Accomm_sub$price <- gsub("\\$", "", BCN_Accomm_sub$price) # removed '$' sign
BCN_Accomm_sub$price <- as.numeric(BCN_Accomm_sub$price) # converted to number format
## X id host_id
## 0 0 0
## host_is_superhost host_listings_count neighbourhood
## 0 22 0
## zipcode latitude longitude
## 0 0 0
## property_type room_type accommodates
## 0 0 0
## bathrooms bedrooms beds
## 6 2 18
## amenities price minimum_nights
## 0 0 0
## has_availability availability_30 availability_60
## 0 0 0
## availability_90 availability_365 number_of_reviews_ltm
## 0 0 0
## review_scores_rating
## 2415
## X id host_id host_is_superhost neighbourhood zipcode latitude longitude
## 7552 1 1 1 1 1 1 1 1
## 2401 1 1 1 1 1 1 1 1
## 17 1 1 1 1 1 1 1 1
## 5 1 1 1 1 1 1 1 1
## 9 1 1 1 1 1 1 1 1
## 8 1 1 1 1 1 1 1 1
## 4 1 1 1 1 1 1 1 1
## 1 1 1 1 1 1 1 1 1
## 1 1 1 1 1 1 1 1 1
## 2 1 1 1 1 1 1 1 1
## 0 0 0 0 0 0 0 0
## property_type room_type accommodates amenities price minimum_nights
## 7552 1 1 1 1 1 1
## 2401 1 1 1 1 1 1
## 17 1 1 1 1 1 1
## 5 1 1 1 1 1 1
## 9 1 1 1 1 1 1
## 8 1 1 1 1 1 1
## 4 1 1 1 1 1 1
## 1 1 1 1 1 1 1
## 1 1 1 1 1 1 1
## 2 1 1 1 1 1 1
## 0 0 0 0 0 0
## has_availability availability_30 availability_60 availability_90
## 7552 1 1 1 1
## 2401 1 1 1 1
## 17 1 1 1 1
## 5 1 1 1 1
## 9 1 1 1 1
## 8 1 1 1 1
## 4 1 1 1 1
## 1 1 1 1 1
## 1 1 1 1 1
## 2 1 1 1 1
## 0 0 0 0
## availability_365 number_of_reviews_ltm bedrooms bathrooms beds
## 7552 1 1 1 1 1
## 2401 1 1 1 1 1
## 17 1 1 1 1 1
## 5 1 1 1 1 1
## 9 1 1 1 1 0
## 8 1 1 1 1 0
## 4 1 1 1 0 1
## 1 1 1 1 0 1
## 1 1 1 1 0 0
## 2 1 1 0 1 1
## 0 0 2 6 18
## host_listings_count review_scores_rating
## 7552 1 1 0
## 2401 1 0 1
## 17 0 1 1
## 5 0 0 2
## 9 1 1 1
## 8 1 0 2
## 4 1 1 1
## 1 1 0 2
## 1 1 1 2
## 2 1 1 1
## 22 2415 2463
| Variable | Missing_Count | Missing_Percent |
|---|---|---|
| review_scores_rating | 2415 | 24.15% |
| host_listings_count | 22 | 0.22% |
| beds | 18 | 0.18% |
| bathrooms | 6 | 0.06% |
| bedrooms | 2 | 0.02% |
| X | 0 | 0% |
| id | 0 | 0% |
| host_id | 0 | 0% |
| host_is_superhost | 0 | 0% |
| neighbourhood | 0 | 0% |
| zipcode | 0 | 0% |
| latitude | 0 | 0% |
| longitude | 0 | 0% |
| property_type | 0 | 0% |
| room_type | 0 | 0% |
| accommodates | 0 | 0% |
| amenities | 0 | 0% |
| price | 0 | 0% |
| minimum_nights | 0 | 0% |
| has_availability | 0 | 0% |
| availability_30 | 0 | 0% |
| availability_60 | 0 | 0% |
| availability_90 | 0 | 0% |
| availability_365 | 0 | 0% |
| number_of_reviews_ltm | 0 | 0% |
- host_listings_count: since the number of listings of a host cannot be estimated from the other fields, the 22 rows that lack it are excluded.
- bathrooms: the number of bathrooms is missing in 6 rows.
- beds: the number of beds is not specified for 18 listings.
- bedrooms: 2 rows contain missing values and can be deleted.
- review_scores_rating: the review score rating is missing in 2,415 of the 10,000 rows, a substantial share of about 24% of the selected data. In this case, the missing values are imputed by replacing them with the value 0.
## X id host_id host_is_superhost neighbourhood zipcode latitude longitude
## 9953 1 1 1 1 1 1 1 1
## 22 1 1 1 1 1 1 1 1
## 17 1 1 1 1 1 1 1 1
## 5 1 1 1 1 1 1 1 1
## 1 1 1 1 1 1 1 1 1
## 2 1 1 1 1 1 1 1 1
## 0 0 0 0 0 0 0 0
## property_type room_type accommodates amenities price minimum_nights
## 9953 1 1 1 1 1 1
## 22 1 1 1 1 1 1
## 17 1 1 1 1 1 1
## 5 1 1 1 1 1 1
## 1 1 1 1 1 1 1
## 2 1 1 1 1 1 1
## 0 0 0 0 0 0
## has_availability availability_30 availability_60 availability_90
## 9953 1 1 1 1
## 22 1 1 1 1
## 17 1 1 1 1
## 5 1 1 1 1
## 1 1 1 1 1
## 2 1 1 1 1
## 0 0 0 0
## availability_365 number_of_reviews_ltm review_scores_rating bedrooms
## 9953 1 1 1 1
## 22 1 1 1 1
## 17 1 1 1 1
## 5 1 1 1 1
## 1 1 1 1 1
## 2 1 1 1 0
## 0 0 0 2
## bathrooms beds host_listings_count
## 9953 1 1 1 0
## 22 1 1 0 1
## 17 1 0 1 1
## 5 0 1 1 1
## 1 0 0 1 2
## 2 1 1 1 1
## 6 18 22 48
Finally, the other fields with missing values do not lend themselves to imputation, that is, replacing the empty data with estimates (mean, median, …). In total, these fields contain 48 missing values spread over 47 rows, about 0.47% of the sample, so the affected rows can be deleted without losing much information. This leaves 9,953 observations.
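The two-step treatment of missing values described in this section (zero-imputation for the review score, row deletion for everything else) might look like the sketch below. The data frame `toy` and its values are made up for illustration; only the column names come from the dataset.

```r
# Toy data frame with the same kinds of gaps as the sample (values are made up).
toy <- data.frame(
  host_listings_count  = c(9, NA, 39, 6, 1),
  bathrooms            = c(1, 1, NA, 2, 1),
  bedrooms             = c(2, 2, 2, 7, 1),
  beds                 = c(5, 5, 5, 13, NA),
  review_scores_rating = c(NA, 83, 90, NA, 96)
)

# Missing-value count and share per column, largest first.
missing_count   <- sort(colSums(is.na(toy)), decreasing = TRUE)
missing_percent <- round(100 * missing_count / nrow(toy), 2)
data.frame(Missing_Count = missing_count, Missing_Percent = missing_percent)

# Step 1: replace missing review scores with 0, as the report does.
toy$review_scores_rating[is.na(toy$review_scores_rating)] <- 0

# Step 2: drop every row that still has a missing value in another column.
toy_clean <- toy[complete.cases(toy), ]
nrow(toy_clean)
```

The order of the two steps matters: imputing the review score first keeps rows whose only gap is the rating, so `complete.cases()` removes only the rows with gaps in the remaining fields.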
# Create a correlation matrix for numeric fields
cor_BNC_Accomm <- select_if(BCN_Accomm, is.numeric) %>%
select(-c(id, X, host_id))
# make a data frame
cor_BNC_Accomm <- data.frame(cor_BNC_Accomm)
str(cor_BNC_Accomm)
## 'data.frame': 9953 obs. of 15 variables:
## $ host_listings_count : int 9 6 39 6 109 1 0 16 32 2 ...
## $ latitude : num 41.4 41.4 41.4 41.4 41.4 ...
## $ longitude : num 2.17 2.2 2.17 2.17 2.14 ...
## $ accommodates : int 5 5 6 16 4 4 1 2 2 4 ...
## $ bathrooms : num 1 1 1 6 1 1 1 1 1 1.5 ...
## $ bedrooms : int 2 2 2 7 2 1 1 0 1 1 ...
## $ beds : int 5 5 5 13 2 2 1 1 2 2 ...
## $ price : num 105 25 85 899 83 45 29 50 165 70 ...
## $ minimum_nights : int 32 1 1 2 1 2 1 1 3 1 ...
## $ availability_30 : int 9 18 21 3 2 0 10 17 19 17 ...
## $ availability_60 : int 39 48 51 24 16 0 30 47 23 47 ...
## $ availability_90 : int 40 78 71 47 26 0 60 77 42 59 ...
## $ availability_365 : int 40 353 327 241 297 0 335 352 127 64 ...
## $ number_of_reviews_ltm: int 0 12 2 7 0 0 2 4 9 21 ...
## $ review_scores_rating : num 0 83 90 92 0 96 100 90 86 97 ...
# print correlation matrix
corrplot(cor(cor_BNC_Accomm), type = "upper", order = "hclust", tl.col = "black")
From the correlation matrix, the following characteristics can be deduced:
- There is almost no correlation between the availability periods and the number of beds, bedrooms, and bathrooms. This suggests that the availability of a listing does not depend on those features, but more likely on its location and facilities.
- There is a positive correlation between the number of bedrooms, beds, and bathrooms.
- There is a positive correlation between the availability periods.
Since the price variable is a key focus of our analysis, an outlier analysis of this variable has been conducted.
## [1] 804
Out of a total of 10,000 values, 804 (8.04%) are identified as outliers. Below, a boxplot is presented to visualize the median and the outlier observations.
From the boxplot above, it can be concluded that the median price is approximately 65€ per night, with 50% of the observations concentrated between 40€ (25th percentile) and 112€ (75th percentile), representing the interquartile range (IQR).
Additionally, the presence of numerous outliers extending to the right indicates a right-skewed distribution, meaning that a long tail of high prices pulls the mean above the median.
The significant number of observations with higher prices could suggest the presence of many luxury properties. Therefore, further analysis is required to identify the factors influencing these price variations.
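A common way to flag such outliers is the 1.5 × IQR rule that boxplots use by default. The sketch below applies that rule to a made-up price vector; it is an assumption that the count of 804 was obtained this way, since the report does not state the rule it applied.

```r
# Made-up nightly prices with a long right tail.
price <- c(25, 30, 40, 45, 50, 60, 65, 70, 85, 110, 112, 150, 400, 899, 8000)

# Quartiles and interquartile range.
q   <- quantile(price, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Fences: anything beyond 1.5 * IQR from the quartiles counts as an outlier.
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

is_outlier <- price < lower | price > upper
sum(is_outlier)      # number of flagged prices
price[is_outlier]    # the flagged prices themselves
```

With a right-skewed variable such as price, the lower fence usually falls below zero, so in practice only the expensive listings are flagged.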
| neighbourhood | property_type | bedrooms | price |
|---|---|---|---|
| Gràcia | Bed and breakfast | 1 | 8000 |
| Sants-Montjuïc | Boat | 4 | 8000 |
| Vila de Gràcia | Bed and breakfast | 1 | 8000 |
| Vila de Gràcia | Bed and breakfast | 1 | 8000 |
| Vila de Gràcia | Bed and breakfast | 1 | 8000 |
| Vila de Gràcia | Bed and breakfast | 1 | 8000 |
| Eixample | Boutique hotel | 1 | 6000 |
| Eixample | Hotel | 1 | 6000 |
| Eixample | Hotel | 1 | 6000 |
| Eixample | Hotel | 1 | 6000 |
| Eixample | Hotel | 1 | 6000 |
| Eixample | Hotel | 1 | 6000 |
| La Nova Esquerra de l’Eixample | Hotel | 1 | 6000 |
| La Nova Esquerra de l’Eixample | Hotel | 1 | 6000 |
| La Nova Esquerra de l’Eixample | Hotel | 1 | 6000 |
| La Nova Esquerra de l’Eixample | Hotel | 1 | 6000 |
| Sant Antoni | Boutique hotel | 1 | 6000 |
| Sant Antoni | Boutique hotel | 1 | 6000 |
| Sant Antoni | Hotel | 1 | 6000 |
| Sant Antoni | Hotel | 1 | 6000 |
It seems that the price variable may contain erroneous entries. For further analysis, research revealed that the average nightly rate for an Airbnb in Barcelona is €93 (according to Hostel Geeks, link: https://hostelgeeks.com/best-airbnbs-in-barcelona-spain/). Therefore, prices of €8,000 are likely errors. As a result, it was decided to exclude prices above €1,000 from the analysis.
The new summary for the price variable is the following:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 7.00 40.00 65.00 96.93 110.00 1000.00
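The price cap described above amounts to a simple row filter. A minimal sketch on made-up values follows; `listings` is a hypothetical data frame, not the team's object.

```r
# Made-up prices, including two implausible entries.
listings <- data.frame(price = c(7, 40, 65, 110, 1000, 6000, 8000))

# Keep listings priced at 1,000 or less; higher values are treated as errors.
listings_filtered <- subset(listings, price <= 1000)

summary(listings_filtered$price)
```

The comparison is `<=` rather than `<` because the summary above still shows a maximum of exactly 1000, so listings at the threshold are kept.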
Now, a new Correlation Matrix with the filtered observations of the price variable is displayed.
## 'data.frame': 9953 obs. of 15 variables:
## $ host_listings_count : int 9 6 39 6 109 1 0 16 32 2 ...
## $ latitude : num 41.4 41.4 41.4 41.4 41.4 ...
## $ longitude : num 2.17 2.2 2.17 2.17 2.14 ...
## $ accommodates : int 5 5 6 16 4 4 1 2 2 4 ...
## $ bathrooms : num 1 1 1 6 1 1 1 1 1 1.5 ...
## $ bedrooms : int 2 2 2 7 2 1 1 0 1 1 ...
## $ beds : int 5 5 5 13 2 2 1 1 2 2 ...
## $ price : num 105 25 85 899 83 45 29 50 165 70 ...
## $ minimum_nights : int 32 1 1 2 1 2 1 1 3 1 ...
## $ availability_30 : int 9 18 21 3 2 0 10 17 19 17 ...
## $ availability_60 : int 39 48 51 24 16 0 30 47 23 47 ...
## $ availability_90 : int 40 78 71 47 26 0 60 77 42 59 ...
## $ availability_365 : int 40 353 327 241 297 0 335 352 127 64 ...
## $ number_of_reviews_ltm: int 0 12 2 7 0 0 2 4 9 21 ...
## $ review_scores_rating : num 0 83 90 92 0 96 100 90 86 97 ...
We can observe that the price variable is now most correlated with variables related to the size and capacity of an Airbnb, such as bathrooms, bedrooms, number of beds, and accommodates.
On the other hand, general availability and review scores have little impact on the price.
From the histograms above, several variables exhibit right-skewed distributions, including price, minimum_nights, and number_of_reviews_ltm.
On the other hand, the data suggests that in Barcelona, Airbnb listings are primarily designed for small groups of people seeking short-term stays. Additionally, these accommodations tend to receive high review scores, indicating good guest satisfaction with the different properties.
Almost 19% of the hosts offering an Airbnb in Barcelona are categorized as Superhosts. This means tourists can find accommodations in the city where hosts go above and beyond to provide excellent hospitality. This insight could be a key factor in explaining the higher price values observed in certain neighborhoods.
This section presents a variety of plots for the categorical variables, including Property Type, Room Type, Top Neighbourhoods, and Amenities.
From the plot above, it can be observed that apartments dominate the Airbnb market in Barcelona, accounting for 86% of the listings.
On the other hand, the low availability of luxury or specialized accommodations, such as Boutique Hotels (0.5%), Guest Suites (0.7%), and Lofts (2.4%), suggests that these property types cater to a niche market. Travelers opting for these accommodations are likely visiting Barcelona for specific reasons, such as work or unique travel experiences.
Most listings are split between the Entire home/apt and Private room types.
Less than 1% of the hosts offer a Shared room, which suggests that travellers prefer more privacy during their stay.
The Eixample district of Barcelona represents the most popular neighbourhood on Airbnb, with 27% of the total listings. This is followed by Ciutat Vella, which accounts for 18.8% of the listings.
Eixample lies close to the historic centre of the city and is more centrally located than other neighbourhoods. The area offers many attractions for tourists, including La Sagrada Familia, Casa Batlló, and Passeig de Gràcia. Combined with its excellent transport connections, this makes Eixample an ideal destination for visitors.
On the other hand, Ciutat Vella is the oldest part of Barcelona and serves as the heart of the city, known for its historical charm and vibrant cultural scene.
Given this, the Tourism Company in Zürich could recommend that its clients focus on these neighbourhoods to attract more customers and enhance their travel experience.
The word cloud above shows the most common amenities offered by the different hosts.
The most prominent amenities are: Kitchen, Wifi, Heating, Washer and Hair dryer. This can be taken to indicate that tourists may consider a place to be comfortable for their stay if it meets these basic requirements.
This section was designed to allow the employees of the Tourism Company and other Users to interact with the data on neighborhoods, prices, and reviews.
The purpose of the following interactive plot is to allow users to select a neighborhood of interest and visualize, on a map, the different accommodations available along with their price per night when one of the circles is clicked.
In the heatmap below, users can observe the zones with higher accommodation prices (red/orange areas).
In contrast, the zones colored in green or blue represent lower-priced neighborhoods.
According to the heatmap, the Tourism Company can recommend the red zones to tourists looking for more centralized accommodations, regardless of price. On the other hand, tourists who want to save money can be advised to choose accommodations in the green or blue areas, which are typically farther from the city center.
From the plot above, the following insights can be derived:
- The majority of listings are concentrated at the lower price range (below 250 Euros), irrespective of room type.
- Accommodations with high review scores (exceeding 90 points) are distributed across all price categories, indicating that well-reviewed Airbnbs are not restricted to a particular room type or price range.
In this chapter, different machine learning models will be explored to predict Airbnb prices and the occupancy rate over the next 30 days.
The formula to calculate the Occupancy rate in 30 days is:
\[ \text{Occupancy Rate} = \left(1 - \frac{\text{availability\_30}}{30}\right) \times 100 \]
According to the formula above, a new column with the occupancy rate is calculated. The first values of the new variable occupancy_rate_30 are:
## [1] 70.00000 40.00000 30.00000 90.00000 93.33333 100.00000
This new variable serves as the response when predicting the occupancy rate over the next 30 days.
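As a worked check of the occupancy formula, the sketch below applies it to the first six `availability_30` values of the sample shown earlier (9, 18, 21, 3, 2, 0). The vector names are chosen for illustration.

```r
# First six availability_30 values from the 10,000-row sample.
availability_30 <- c(9, 18, 21, 3, 2, 0)

# Share of the next 30 nights that is NOT available, as a percentage.
occupancy_rate_30 <- (1 - availability_30 / 30) * 100

round(occupancy_rate_30, 5)
# 70.00000 40.00000 30.00000 90.00000 93.33333 100.00000
```

These values match the head of the new variable printed above: a listing with no free nights in the next 30 days gets an occupancy rate of 100.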
As mentioned in the previous chapters, the categorical variables are converted into factors to proceed with the modeling phase.
Before analysing the different models, the data must be divided into a training set and a test set. The first set is used to find the relationship between the dependent and independent variables, while the second set is used to assess the performance of the models. We use 60% of the data set as the training set and the rest as the test set.
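A 60/40 split of this kind can be sketched as follows. The data frame `listings`, its size, and the seed value are assumptions made for the example.

```r
set.seed(42)  # assumed seed, so the split is reproducible

# Hypothetical listings table.
n        <- 1000
listings <- data.frame(id = seq_len(n), price = round(runif(n, 20, 300)))

# Randomly pick 60% of the row indices for training.
train_idx <- sample(n, size = floor(0.6 * n))

train_set <- listings[train_idx, ]   # used to fit the models
test_set  <- listings[-train_idx, ]  # held out to evaluate them

c(train = nrow(train_set), test = nrow(test_set))
```

Indexing with `-train_idx` guarantees that no listing appears in both sets, so the test metrics are computed only on data the model has not seen.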
Variables for Pricing Model
Next, based on the Correlation Matrix shown in chapter 3.4, the variables used to address the first research question about price are:
- bedrooms
- bathrooms
- accommodates
- beds
- latitude and longitude
- review_scores_rating
- minimum_nights
- property_type
- room_type
- neighbourhood

Variables for Occupancy Rate Model
The variables used for the Occupancy rate in one month are:
- latitude and longitude (location)
- bathrooms
- bedrooms
- accommodates
- beds
- price
- minimum_nights
- review_scores_rating
- neighbourhood

It is used to analyse …. and answer the project question x
- Accuracy
- Precision
- Recall
- RMSE (Root Mean Squared Error)
- MAE (Mean Absolute Error)
- R Squared
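Since price and occupancy rate are continuous responses, the regression metrics in the list (RMSE, MAE, R Squared) are the ones that apply directly. The sketch below shows one common way to compute them; the helper names `rmse`, `mae`, and `r_squared` and the example values are hypothetical, and the report may compute R Squared differently (for instance as a squared correlation).

```r
# Root mean squared error: penalises large errors more strongly.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Mean absolute error: average size of the error, in the unit of the response.
mae <- function(actual, predicted) mean(abs(actual - predicted))

# R squared as 1 - SSE/SST: share of variance explained by the predictions.
r_squared <- function(actual, predicted) {
  1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)
}

# Made-up observed and predicted nightly prices.
actual    <- c(105, 25, 85, 83, 45)
predicted <- c(95, 40, 80, 90, 50)

c(RMSE = rmse(actual, predicted),
  MAE  = mae(actual, predicted),
  R2   = r_squared(actual, predicted))
```

RMSE is never smaller than MAE, and the gap between the two grows when a few predictions are far off, which is worth keeping in mind for a right-skewed response such as price.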
In this chapter, Generalized Additive Models (GAMs) will be applied with the Price variable as the response, to analyze its interactions with the predictor variables.
The first goal is to identify key factors influencing prices in Barcelona, addressing Research Question 1 (What are the key factors influencing accommodation prices in Barcelona?).
First, we aim to determine whether a nonlinear relationship exists between the independent variables and price. To explore this, the variable Review Score Rating will be plotted against Price to visualize whether the relationship is linear or not.
## `geom_smooth()` using formula = 'y ~ x'
From the plot above and Chapter 4.3 of this report, we observe that many points are concentrated on the right side, where higher review scores are paired with lower prices. This suggests a lack of a strong relationship between Review Score Ratings and Price.
Given that at least one variable does not exhibit a linear relationship, we will proceed with applying a Generalized Additive Model (GAM) to better capture potential nonlinear interactions.
The GAM is fitted using the training data.
##
## Family: gaussian
## Link function: identity
##
## Formula:
## price ~ s(bathrooms) + s(bedrooms) + s(accommodates) + s(beds) +
## s(latitude) + s(longitude) + s(review_scores_rating) + s(minimum_nights) +
## room_type + neighbourhood
##
## Parametric coefficients:
## Estimate Std. Error t value
## (Intercept) 287.59 99.68 2.885
## room_typePrivate room -46.48 4.57 -10.169
## room_typeShared room -68.94 14.23 -4.844
## neighbourhoodCamp d'en Grassot i Gràcia Nova -148.40 100.97 -1.470
## neighbourhoodCan Baro -103.47 105.54 -0.980
## neighbourhoodCarmel -94.38 103.32 -0.913
## neighbourhoodCiutat Vella -183.32 100.32 -1.827
## neighbourhoodDiagonal Mar - La Mar Bella -129.25 102.61 -1.260
## neighbourhoodDreta de l'Eixample -150.31 99.98 -1.503
## neighbourhoodEixample -166.24 99.83 -1.665
## neighbourhoodEl Baix Guinardó -153.27 102.44 -1.496
## neighbourhoodEl Besòs i el Maresme -169.69 103.04 -1.647
## neighbourhoodEl Bon Pastor -159.11 113.36 -1.404
## neighbourhoodEl Born -179.28 101.33 -1.769
## neighbourhoodEl Camp de l'Arpa del Clot -155.39 101.04 -1.538
## neighbourhoodEl Clot -148.60 102.47 -1.450
## neighbourhoodEl Coll -129.06 118.84 -1.086
## neighbourhoodEl Congrés i els Indians -155.55 107.34 -1.449
## neighbourhoodel Fort Pienc -176.53 100.62 -1.754
## neighbourhoodEl Gòtic -191.99 100.63 -1.908
## neighbourhoodEl Poble-sec -183.61 100.74 -1.823
## neighbourhoodEl Poblenou -144.92 101.55 -1.427
## neighbourhoodEl Putget i Farró -101.97 101.40 -1.006
## neighbourhoodEl Raval -185.13 100.45 -1.843
## neighbourhoodGlòries - El Parc -175.09 101.37 -1.727
## neighbourhoodGràcia -149.17 99.85 -1.494
## neighbourhoodGuinardó -139.29 101.13 -1.377
## neighbourhoodHorta -92.63 135.31 -0.685
## neighbourhoodHorta-Guinardó -121.90 99.70 -1.223
## neighbourhoodL'Antiga Esquerra de l'Eixample -156.20 100.18 -1.559
## neighbourhoodLa Barceloneta -162.37 101.36 -1.602
## neighbourhoodLa Font d'en Fargues -148.44 112.25 -1.322
## neighbourhoodLa Maternitat i Sant Ramon -205.57 99.97 -2.056
## neighbourhoodLa Nova Esquerra de l'Eixample -165.07 100.30 -1.646
## neighbourhoodLa Prosperitat -79.77 137.10 -0.582
## neighbourhoodLa Sagrada Família -170.82 100.21 -1.705
## neighbourhoodLa Sagrera -117.18 104.13 -1.125
## neighbourhoodLa Salut -133.19 102.32 -1.302
## neighbourhoodLa Teixonera -144.49 112.34 -1.286
## neighbourhoodLa Trinitat Vella -183.15 120.73 -1.517
## neighbourhoodLa Verneda i La Pau -155.21 105.30 -1.474
## neighbourhoodLa Vila Olímpica -138.96 102.00 -1.362
## neighbourhoodLes Corts -192.97 99.50 -1.939
## neighbourhoodLes Tres Torres -183.02 107.64 -1.700
## neighbourhoodMontbau -106.58 135.05 -0.789
## neighbourhoodNavas -151.56 103.03 -1.471
## neighbourhoodNou Barris -129.69 100.50 -1.291
## neighbourhoodPedralbes -214.17 117.40 -1.824
## neighbourhoodPorta -134.03 109.63 -1.223
## neighbourhoodProvençals del Poblenou -162.56 103.96 -1.564
## neighbourhoodSant Andreu -127.31 100.18 -1.271
## neighbourhoodSant Andreu de Palomar -128.94 103.53 -1.245
## neighbourhoodSant Antoni -179.28 100.42 -1.785
## neighbourhoodSant Genís dels Agudells -133.29 112.16 -1.188
## neighbourhoodSant Gervasi - Galvany -170.18 100.55 -1.693
## neighbourhoodSant Gervasi - la Bonanova -168.19 119.00 -1.413
## neighbourhoodSant Martí -152.57 100.43 -1.519
## neighbourhoodSant Martí de Provençals -160.63 104.23 -1.541
## neighbourhoodSant Pere/Santa Caterina -180.09 100.60 -1.790
## neighbourhoodSants-Montjuïc -182.58 100.12 -1.824
## neighbourhoodSarrià -177.41 97.98 -1.811
## neighbourhoodSarrià-Sant Gervasi -153.96 99.83 -1.542
## neighbourhoodTrinitat Nova -146.10 136.65 -1.069
## neighbourhoodTuró de la Peira - Can Peguera -112.90 105.94 -1.066
## neighbourhoodVallcarca i els Penitents -78.76 103.08 -0.764
## neighbourhoodVerdum - Los Roquetes -138.45 110.82 -1.249
## neighbourhoodVila de Gràcia -107.88 100.10 -1.078
## neighbourhoodVilapicina i la Torre Llobeta -158.61 109.05 -1.455
## Pr(>|t|)
## (Intercept) 0.00393 **
## room_typePrivate room < 2e-16 ***
## room_typeShared room 1.31e-06 ***
## neighbourhoodCamp d'en Grassot i Gràcia Nova 0.14173
## neighbourhoodCan Baro 0.32696
## neighbourhoodCarmel 0.36105
## neighbourhoodCiutat Vella 0.06770 .
## neighbourhoodDiagonal Mar - La Mar Bella 0.20788
## neighbourhoodDreta de l'Eixample 0.13280
## neighbourhoodEixample 0.09594 .
## neighbourhoodEl Baix Guinardó 0.13464
## neighbourhoodEl Besòs i el Maresme 0.09966 .
## neighbourhoodEl Bon Pastor 0.16051
## neighbourhoodEl Born 0.07692 .
## neighbourhoodEl Camp de l'Arpa del Clot 0.12414
## neighbourhoodEl Clot 0.14705
## neighbourhoodEl Coll 0.27755
## neighbourhoodEl Congrés i els Indians 0.14739
## neighbourhoodel Fort Pienc 0.07944 .
## neighbourhoodEl Gòtic 0.05647 .
## neighbourhoodEl Poble-sec 0.06842 .
## neighbourhoodEl Poblenou 0.15361
## neighbourhoodEl Putget i Farró 0.31463
## neighbourhoodEl Raval 0.06539 .
## neighbourhoodGlòries - El Parc 0.08419 .
## neighbourhoodGràcia 0.13523
## neighbourhoodGuinardó 0.16849
## neighbourhoodHorta 0.49366
## neighbourhoodHorta-Guinardó 0.22151
## neighbourhoodL'Antiga Esquerra de l'Eixample 0.11901
## neighbourhoodLa Barceloneta 0.10927
## neighbourhoodLa Font d'en Fargues 0.18612
## neighbourhoodLa Maternitat i Sant Ramon 0.03981 *
## neighbourhoodLa Nova Esquerra de l'Eixample 0.09986 .
## neighbourhoodLa Prosperitat 0.56069
## neighbourhoodLa Sagrada Família 0.08831 .
## neighbourhoodLa Sagrera 0.26053
## neighbourhoodLa Salut 0.19309
## neighbourhoodLa Teixonera 0.19843
## neighbourhoodLa Trinitat Vella 0.12931
## neighbourhoodLa Verneda i La Pau 0.14054
## neighbourhoodLa Vila Olímpica 0.17316
## neighbourhoodLes Corts 0.05251 .
## neighbourhoodLes Tres Torres 0.08915 .
## neighbourhoodMontbau 0.43001
## neighbourhoodNavas 0.14137
## neighbourhoodNou Barris 0.19693
## neighbourhoodPedralbes 0.06817 .
## neighbourhoodPorta 0.22154
## neighbourhoodProvençals del Poblenou 0.11796
## neighbourhoodSant Andreu 0.20387
## neighbourhoodSant Andreu de Palomar 0.21305
## neighbourhoodSant Antoni 0.07426 .
## neighbourhoodSant Genís dels Agudells 0.23472
## neighbourhoodSant Gervasi - Galvany 0.09060 .
## neighbourhoodSant Gervasi - la Bonanova 0.15761
## neighbourhoodSant Martí 0.12876
## neighbourhoodSant Martí de Provençals 0.12335
## neighbourhoodSant Pere/Santa Caterina 0.07350 .
## neighbourhoodSants-Montjuïc 0.06828 .
## neighbourhoodSarrià 0.07027 .
## neighbourhoodSarrià-Sant Gervasi 0.12307
## neighbourhoodTrinitat Nova 0.28507
## neighbourhoodTuró de la Peira - Can Peguera 0.28659
## neighbourhoodVallcarca i els Penitents 0.44488
## neighbourhoodVerdum - Los Roquetes 0.21160
## neighbourhoodVila de Gràcia 0.28117
## neighbourhoodVilapicina i la Torre Llobeta 0.14587
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(bathrooms) 4.974 6.008 8.753 < 2e-16 ***
## s(bedrooms) 4.372 5.383 3.148 0.00699 **
## s(accommodates) 5.345 6.276 9.863 < 2e-16 ***
## s(beds) 1.804 2.296 2.179 0.10021
## s(latitude) 6.629 7.860 8.555 < 2e-16 ***
## s(longitude) 6.672 7.900 4.342 3.37e-05 ***
## s(review_scores_rating) 4.581 5.497 12.340 < 2e-16 ***
## s(minimum_nights) 3.726 4.438 37.384 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.322 Deviance explained = 33.6%
## -REML = 28894 Scale est. = 8414.4 n = 4913
## R-squared on Test Set: 0.338637
## MAE: 46.30068
## RMSE: 87.05353
## R-squared: 0.338637
##
## Family: gaussian
## Link function: identity
##
## Formula:
## occupancy_rate_30 ~ s(latitude) + s(longitude) + s(bathrooms) +
## s(bedrooms) + s(accommodates) + s(beds) + s(price) + s(minimum_nights) +
## s(review_scores_rating) + (neighbourhood)
##
## Parametric coefficients:
## Estimate Std. Error t value
## (Intercept) 93.9828 29.7302 3.161
## neighbourhoodCamp d'en Grassot i Gràcia Nova -21.6244 30.0530 -0.720
## neighbourhoodCan Baro -57.3858 31.5162 -1.821
## neighbourhoodCarmel -28.6511 30.9223 -0.927
## neighbourhoodCiutat Vella -21.7160 29.9352 -0.725
## neighbourhoodDiagonal Mar - La Mar Bella -22.8594 30.5217 -0.749
## neighbourhoodDreta de l'Eixample -23.5350 29.8054 -0.790
## neighbourhoodEixample -20.8334 29.7577 -0.700
## neighbourhoodEl Baix Guinardó -22.6073 30.4980 -0.741
## neighbourhoodEl Besòs i el Maresme -26.9060 30.6085 -0.879
## neighbourhoodEl Bon Pastor 0.7423 34.0760 0.022
## neighbourhoodEl Born -12.2346 30.2609 -0.404
## neighbourhoodEl Camp de l'Arpa del Clot -20.9535 30.0562 -0.697
## neighbourhoodEl Clot -29.5142 30.5036 -0.968
## neighbourhoodEl Coll -54.0424 35.8055 -1.509
## neighbourhoodEl Congrés i els Indians -14.3995 32.2823 -0.446
## neighbourhoodel Fort Pienc -17.4830 30.0132 -0.583
## neighbourhoodEl Gòtic -19.6208 30.0208 -0.654
## neighbourhoodEl Poble-sec -20.4614 30.0859 -0.680
## neighbourhoodEl Poblenou -11.5547 30.2098 -0.382
## neighbourhoodEl Putget i Farró -24.9957 30.0892 -0.831
## neighbourhoodEl Raval -21.0668 29.9798 -0.703
## neighbourhoodGlòries - El Parc -19.9000 30.2422 -0.658
## neighbourhoodGràcia -18.6348 29.6868 -0.628
## neighbourhoodGuinardó -21.6961 30.1835 -0.719
## neighbourhoodHorta -50.2236 41.1562 -1.220
## neighbourhoodHorta-Guinardó -25.7994 29.7692 -0.867
## neighbourhoodL'Antiga Esquerra de l'Eixample -24.4582 29.8252 -0.820
## neighbourhoodLa Barceloneta -16.4498 30.2493 -0.544
## neighbourhoodLa Font d'en Fargues -27.3096 33.8886 -0.806
## neighbourhoodLa Maternitat i Sant Ramon -21.4781 30.0036 -0.716
## neighbourhoodLa Nova Esquerra de l'Eixample -23.2732 29.9045 -0.778
## neighbourhoodLa Prosperitat -81.0111 41.6938 -1.943
## neighbourhoodLa Sagrada Família -22.1307 29.8259 -0.742
## neighbourhoodLa Sagrera -23.8149 31.1802 -0.764
## neighbourhoodLa Salut -7.2049 30.4009 -0.237
## neighbourhoodLa Teixonera -27.2127 33.8132 -0.805
## neighbourhoodLa Trinitat Vella -34.8536 35.9411 -0.970
## neighbourhoodLa Verneda i La Pau -28.5639 31.5070 -0.907
## neighbourhoodLa Vila Olímpica -21.1895 30.3786 -0.698
## neighbourhoodLes Corts -21.8761 29.6501 -0.738
## neighbourhoodLes Tres Torres -32.0797 32.0809 -1.000
## neighbourhoodMontbau -23.5499 41.0902 -0.573
## neighbourhoodNavas -17.8430 30.7322 -0.581
## neighbourhoodNou Barris -29.3390 29.9159 -0.981
## neighbourhoodPedralbes -29.5077 35.5738 -0.829
## neighbourhoodPorta -23.4399 32.9009 -0.712
## neighbourhoodProvençals del Poblenou -19.1179 30.9081 -0.619
## neighbourhoodSant Andreu -21.4528 29.9403 -0.717
## neighbourhoodSant Andreu de Palomar -20.5631 30.8933 -0.666
## neighbourhoodSant Antoni -21.1097 29.9761 -0.704
## neighbourhoodSant Genís dels Agudells -32.7547 33.7449 -0.971
## neighbourhoodSant Gervasi - Galvany -27.1806 29.8662 -0.910
## neighbourhoodSant Gervasi - la Bonanova -50.0196 35.7186 -1.400
## neighbourhoodSant Martí -18.7222 29.8821 -0.627
## neighbourhoodSant Martí de Provençals -17.4224 31.0041 -0.562
## neighbourhoodSant Pere/Santa Caterina -24.0552 30.0181 -0.801
## neighbourhoodSants-Montjuïc -21.4089 29.8016 -0.718
## neighbourhoodSarrià -23.5800 29.9708 -0.787
## neighbourhoodSarrià-Sant Gervasi -26.5323 29.6603 -0.895
## neighbourhoodTrinitat Nova -25.6116 41.2008 -0.622
## neighbourhoodTuró de la Peira - Can Peguera -13.7704 31.8177 -0.433
## neighbourhoodVallcarca i els Penitents -26.9538 30.6695 -0.879
## neighbourhoodVerdum - Los Roquetes -24.9577 32.9081 -0.758
## neighbourhoodVila de Gràcia -18.4190 29.7581 -0.619
## neighbourhoodVilapicina i la Torre Llobeta -32.3390 32.8714 -0.984
## Pr(>|t|)
## (Intercept) 0.00158 **
## neighbourhoodCamp d'en Grassot i Gràcia Nova 0.47184
## neighbourhoodCan Baro 0.06869 .
## neighbourhoodCarmel 0.35421
## neighbourhoodCiutat Vella 0.46822
## neighbourhoodDiagonal Mar - La Mar Bella 0.45392
## neighbourhoodDreta de l'Eixample 0.42979
## neighbourhoodEixample 0.48390
## neighbourhoodEl Baix Guinardó 0.45857
## neighbourhoodEl Besòs i el Maresme 0.37943
## neighbourhoodEl Bon Pastor 0.98262
## neighbourhoodEl Born 0.68601
## neighbourhoodEl Camp de l'Arpa del Clot 0.48575
## neighbourhoodEl Clot 0.33331
## neighbourhoodEl Coll 0.13128
## neighbourhoodEl Congrés i els Indians 0.65558
## neighbourhoodel Fort Pienc 0.56025
## neighbourhoodEl Gòtic 0.51342
## neighbourhoodEl Poble-sec 0.49647
## neighbourhoodEl Poblenou 0.70212
## neighbourhoodEl Putget i Farró 0.40617
## neighbourhoodEl Raval 0.48228
## neighbourhoodGlòries - El Parc 0.51056
## neighbourhoodGràcia 0.53022
## neighbourhoodGuinardó 0.47229
## neighbourhoodHorta 0.22241
## neighbourhoodHorta-Guinardó 0.38618
## neighbourhoodL'Antiga Esquerra de l'Eixample 0.41223
## neighbourhoodLa Barceloneta 0.58660
## neighbourhoodLa Font d'en Fargues 0.42036
## neighbourhoodLa Maternitat i Sant Ramon 0.47412
## neighbourhoodLa Nova Esquerra de l'Eixample 0.43646
## neighbourhoodLa Prosperitat 0.05207 .
## neighbourhoodLa Sagrada Família 0.45813
## neighbourhoodLa Sagrera 0.44503
## neighbourhoodLa Salut 0.81267
## neighbourhoodLa Teixonera 0.42098
## neighbourhoodLa Trinitat Vella 0.33222
## neighbourhoodLa Verneda i La Pau 0.36467
## neighbourhoodLa Vila Olímpica 0.48551
## neighbourhoodLes Corts 0.46067
## neighbourhoodLes Tres Torres 0.31738
## neighbourhoodMontbau 0.56659
## neighbourhoodNavas 0.56154
## neighbourhoodNou Barris 0.32678
## neighbourhoodPedralbes 0.40687
## neighbourhoodPorta 0.47623
## neighbourhoodProvençals del Poblenou 0.53625
## neighbourhoodSant Andreu 0.47371
## neighbourhoodSant Andreu de Palomar 0.50569
## neighbourhoodSant Antoni 0.48133
## neighbourhoodSant Genís dels Agudells 0.33177
## neighbourhoodSant Gervasi - Galvany 0.36283
## neighbourhoodSant Gervasi - la Bonanova 0.16146
## neighbourhoodSant Martí 0.53099
## neighbourhoodSant Martí de Provençals 0.57418
## neighbourhoodSant Pere/Santa Caterina 0.42296
## neighbourhoodSants-Montjuïc 0.47256
## neighbourhoodSarrià 0.43146
## neighbourhoodSarrià-Sant Gervasi 0.37108
## neighbourhoodTrinitat Nova 0.53422
## neighbourhoodTuró de la Peira - Can Peguera 0.66519
## neighbourhoodVallcarca i els Penitents 0.37953
## neighbourhoodVerdum - Los Roquetes 0.44825
## neighbourhoodVila de Gràcia 0.53597
## neighbourhoodVilapicina i la Torre Llobeta 0.32526
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(latitude) 1.019 1.038 0.178 0.69673
## s(longitude) 2.902 3.807 1.155 0.30559
## s(bathrooms) 3.571 4.461 4.499 0.00101 **
## s(bedrooms) 3.252 4.173 4.445 0.00146 **
## s(accommodates) 4.284 5.237 9.958 < 2e-16 ***
## s(beds) 3.416 4.264 1.596 0.14195
## s(price) 6.027 7.118 28.644 < 2e-16 ***
## s(minimum_nights) 5.898 6.840 6.892 < 2e-16 ***
## s(review_scores_rating) 3.332 4.055 24.927 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.0904 Deviance explained = 10.9%
## -REML = 23250 Scale est. = 817.01 n = 4913
## R-squared on Test Set (Occupancy 30): 0.04477809
## MAE: 64.46847
## RMSE: 66.04594
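The symbols printed next to the p-values in the summaries above follow the legend line `0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1`. As a reading aid, the mapping can be sketched as below. This is an illustrative Python helper and not part of the report's own code; the function name `signif_code` is hypothetical, and the p-values are copied from the occupancy model output.

```python
def signif_code(p_value: float) -> str:
    """Return the significance symbol for a p-value, following the legend above."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    # Cut-points taken from the legend: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    for cutoff, symbol in ((0.001, "***"), (0.01, "**"), (0.05, "*"), (0.1, ".")):
        if p_value <= cutoff:
            return symbol
    return " "


# p-values copied from the occupancy_rate_30 model summary
examples = {
    "(Intercept)": 0.00158,
    "s(bedrooms)": 0.00146,
    "neighbourhoodLa Prosperitat": 0.05207,
    "s(beds)": 0.14195,
}
for term, p in examples.items():
    print(f"{term:30s} p = {p:.5f}  [{signif_code(p)}]")
```

Read this way, none of the neighbourhood terms in the occupancy model falls below the 0.05 threshold, while most smooth terms other than `s(latitude)`, `s(longitude)` and `s(beds)` do.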
A neural network model is fitted using the training data.
## Length Class Mode
## call 5 -none- call
## response 4913 -none- numeric
## covariate 39304 -none- numeric
## model.list 2 -none- list
## err.fct 1 -none- function
## act.fct 1 -none- function
## linear.output 1 -none- logical
## data 9 data.frame list
## exclude 0 -none- NULL
## net.result 1 -none- list
## weights 1 -none- list
## generalized.weights 1 -none- list
## startweights 1 -none- list
## result.matrix 34 -none- numeric
## MAE: 48.05641
## RMSE: 90.50193
## R-squared: 0.2852029
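The neural network's test metrics can be set against those reported earlier for the price GAM (MAE 46.30, RMSE 87.05, R-squared 0.339). The short Python sketch below does this comparison. It assumes, without confirmation from the output itself, that both sets of metrics refer to the same price target and the same test set; the dictionary layout and the helper `best_model` are hypothetical.

```python
# Test-set metrics copied from the outputs above.
# Assumption: both models predict the same target on the same test set.
reported = {
    "GAM": {"MAE": 46.30068, "RMSE": 87.05353, "R2": 0.338637},
    "Neural network": {"MAE": 48.05641, "RMSE": 90.50193, "R2": 0.2852029},
}

# Lower is better for the error metrics, higher is better for R-squared.
higher_is_better = {"MAE": False, "RMSE": False, "R2": True}


def best_model(metric: str) -> str:
    """Name of the model with the best reported value for the given metric."""
    pick = max if higher_is_better[metric] else min
    return pick(reported, key=lambda name: reported[name][metric])


for metric in higher_is_better:
    print(f"{metric}: best = {best_model(metric)}")
```

Under that assumption, the GAM comes out slightly ahead on all three metrics, although the differences are small relative to the size of the errors.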
It is used to analyse …. and answer the project question x
- Accuracy
- Precision
- Recall
- RMSE (Root Mean Squared Error)
- MAE (Mean Absolute Error)
- R-squared
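For reference, the metrics listed above can be computed as in the following minimal Python sketch. It uses the common textbook definitions (R-squared as one minus the ratio of residual to total sum of squares); the report's own R code may define R-squared differently, for example as a squared correlation. All function names and the toy values are hypothetical and are not taken from the Airbnb data.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def classification_scores(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
        "recall": tp / (tp + fn) if tp + fn else float("nan"),
    }


# Toy values for illustration only
prices_true = [100, 150, 200, 250]
prices_pred = [110, 140, 190, 270]
print(mae(prices_true, prices_pred), rmse(prices_true, prices_pred),
      r_squared(prices_true, prices_pred))

labels_true = [1, 0, 1, 1, 0, 0]
labels_pred = [1, 0, 0, 1, 1, 0]
print(classification_scores(labels_true, labels_pred))
```

MAE, RMSE and R-squared apply to the regression models for price and occupancy, whereas accuracy, precision and recall only apply if a target is recast as a classification problem.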
- how you used generative AI in writing the group work (code-related questions, generating text, explaining concepts…)
- what was easy, hard, or impossible to do with generative AI
- what you had to pay attention to or be critical about when using the results obtained through generative AI
Table describing every variable and its type